Navigation in non-uniform density social networks 






Yanqing Hu, Yong Li, Zengru Di, Ying Fan* 
Department of Systems Science, School of Management and Center for 
Complexity Research, Beijing Normal University, Beijing 100875, China 
(Dated: January 6, 2011) 

Recent empirical investigations suggest a universal scaling law for the spatial structure of social networks:
the probability density of an individual having a friend at distance $d$ scales as $P(d) \propto d^{-1}$.
Since population density is non-uniform in real social networks, a scale invariant friendship network (SIFN)
based on this empirical law is introduced to capture this phenomenon. We prove that the time complexity of
navigation in a 2-dimensional SIFN is at most $O(\log^4 n)$. In real searching experiments, individuals often
resort to extra information besides geographic location. Thus, the real-world searching process may be seen as
a projection of navigation in a $k$-dimensional SIFN ($k > 2$). Therefore, we also discuss the relationship
between high and low dimensional SIFN. In particular, we prove that a 2-dimensional SIFN is the projection of a
3-dimensional SIFN. This result can also be generalized to any $k$-dimensional SIFN.
I. INTRODUCTION

To understand the structure of the social networks in which 
we live is a very interesting problem. As part of the re- 
cent surge of interest in networks, there have been active 
research about social networksJll-0]. Besides some well 
known common properties such as small-world and commu- 
nity structure||3-0], much attention has been dedicated to nav- 
igation in real social networks. 

In the 1960s, Milgram and his co-workers conducted the
first small-world experiment [10]. Randomly chosen individuals
in the United States were asked to send a letter to a particular
recipient using only friends or acquaintances. The results
of the experiment revealed that the average number of intermediate
steps in a successful chain is about six. Since then, "six
degrees of separation" has become the subject of both experimental
and theoretical research [11, 12]. Recently, Dodds et al.
carried out an experimental study in a global social network
consisting of about 60,000 email users [13]. They estimated that
social navigation can reach its targets in a median of five
to seven steps, which is similar to the results of Milgram's
experiment.

The first theoretical navigation model was proposed by
Kleinberg [14, 15]. He introduced an $n \times n$ lattice to model
social networks. In addition to the links between nearest neighbors,
each node $u$ is connected to a random node $v$ with a
probability proportional to $d(u,v)^{-r}$, where $d(u,v)$ denotes the
lattice distance between $u$ and $v$. Kleinberg proved that
optimal navigation is obtained when the power-law
exponent $r$ equals $d$, where $d$ is the dimensionality of the
lattice, and that the time complexity of navigation in that case is
at most $O(\log^2 n)$. Since then, much attention has been dedicated
to Kleinberg's navigation model [16-18]. Roberson et
al. studied the navigation problem in fractal networks, where
they proved that $r = d$ is also the optimal power-law exponent
in the fractal case [19]. Carmi, Cartozo and their respective
co-workers have provided exact solutions for the
asymptotic behavior of Kleinberg's navigation model [20, 21].
More recently, the navigation problem with a total cost restriction
has also been discussed, where the cost denotes the
length of the long-range connections [22, 23].

Meanwhile, recent empirical investigations suggest a universal
spatial scaling law on social networks. Liben-Nowell
et al. explored the role of geography alone in routing messages
within the LiveJournal social network [24]. They found
that the probability density function (PDF) of the geographic
distance $d$ between friends was about $P(d) \propto d^{-1}$. Adamic
and Adar also observed the $P(d) \propto d^{-1}$ law when investigating
the Hewlett-Packard Labs email network [25]. Lambiotte
et al. analyzed the statistical properties of a communication
network constructed from the records of a mobile phone company [26].
Their empirical results showed that the probability
that two people $u$ and $v$ living at a geographic distance
$d(u,v)$ were connected by a link was proportional to $d(u,v)^{-2}$.
Because the number of nodes at distance $d$ from any given
node is proportional to $d$ in a 2-dimensional world, the probability
for an individual to have a friend at distance $d$ should
be $P(d) \propto d \cdot d^{-2} = d^{-1}$. More recently, Goldenberg et al.
studied the effect of the IT revolution on social interactions [27].
By analyzing an extensive data set of the Facebook online
social network, they pointed out that social communication
decreases inversely with the distance $d$, following the scaling
law $P(d) \propto d^{-1}$ as well.
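
The counting step behind $P(d) \propto d \cdot d^{-2} = d^{-1}$ can be checked on a small example. The sketch below is illustrative only: it assumes an unbounded 2-dimensional lattice with Manhattan distance, and the helper names `nodes_at_distance` and `friend_distance_weights` are hypothetical, not taken from the cited studies.

```python
def nodes_at_distance(d: int) -> int:
    """Count lattice points (x, y) with |x| + |y| == d on an unbounded 2-D lattice."""
    count = 0
    for x in range(-d, d + 1):
        remaining = d - abs(x)
        # y = +remaining and y = -remaining coincide only when remaining == 0
        count += 1 if remaining == 0 else 2
    return count


def friend_distance_weights(max_d: int) -> list[float]:
    """Unnormalized P(d): (nodes at distance d) times a per-pair link weight d**-2."""
    return [nodes_at_distance(d) * d ** -2 for d in range(1, max_d + 1)]


weights = friend_distance_weights(5)
# If P(d) is proportional to 1/d, then d * weight(d) is the same for every d.
scaled = [d * w for d, w in enumerate(weights, start=1)]
```

Here `nodes_at_distance(d)` grows linearly in `d`, so multiplying by the per-pair weight $d^{-2}$ leaves a $d^{-1}$ dependence, which is the argument made in the text.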

As in the LiveJournal social network, population density
is non-uniform in real social networks [24]. To deal with
the navigation problem with non-uniform population density,
a scale invariant friendship network (SIFN for short)
model based on the above spatial scaling law $P(d) \propto d^{-1}$
of social networks is proposed in this paper. We prove that the
time complexity of navigation in a 2-dimensional SIFN is at
most $O(\log^4 n)$, which indicates that social networks are navigable.
Dodds et al. have pointed out that individuals often resort
to extra information such as education and professional
information besides geographic location in real searching
experiments [13]. Considering this phenomenon, the navigation
process in the real world may be seen as the projection of navigation
in a higher dimensional SIFN. Therefore, we further
discuss the relationship between high and low dimensional
SIFN. In particular, we prove through theoretical analysis that
a 2-dimensional SIFN can be seen as the projection of any
$k$-dimensional SIFN ($k > 2$).



II. NAVIGATION IN NON-UNIFORM DENSITY SOCIAL
NETWORKS

To deal with the non-uniform population density in real social
networks, we divide the whole population into small areas
and make the following two assumptions. First, the population
density is uniform in each small area. Second, the minimum
population density among the areas is $m$, while the maximum
is $M$. We set $m > 0$ to guarantee that a searching algorithm
can always make some progress toward any target at every
step of the chain.

Like Kleinberg's network (KN for short) and Liben-Nowell's
rank-based friendship network (RFN for short), we
employ an $n \times n$ lattice to construct SIFN. Without loss of
generality, we assume each node $u$ has $q$ directed long-range
connections, where $q$ is a constant [15]. To generate a long-range
connection of node $u$, we first randomly choose a distance $d$
according to the observed scaling law $P(d) \propto d^{-1}$ in social
networks. We then randomly choose a node $v$ from the set of
nodes at lattice distance $d$ from node $u$,
and create a directed long-range connection from $u$ to $v$. The
lattice is assumed to be large enough that the long-range
connections will not overlap.
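
The two-step construction above (draw a distance, then draw a node at that distance) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the lattice is a finite $n \times n$ grid without wrap-around, distances whose ring falls entirely outside the grid are skipped, and the function names `ring` and `sample_long_range_link` are hypothetical.

```python
import random


def ring(u, d, n):
    """Nodes of the n x n lattice at lattice (Manhattan) distance exactly d from u."""
    nodes = []
    for dx in range(-d, d + 1):
        dy = d - abs(dx)
        # the set removes the duplicate that appears when dy == 0
        for y in sorted({u[1] + dy, u[1] - dy}):
            x = u[0] + dx
            if 0 <= x < n and 0 <= y < n:
                nodes.append((x, y))
    return nodes


def sample_long_range_link(u, n, rng):
    """Draw one long-range contact of u: distance d with P(d) ~ 1/d, then a uniform node at distance d."""
    # Assumption: only distances with at least one node inside the finite lattice are admissible.
    distances = [d for d in range(1, 2 * (n - 1) + 1) if ring(u, d, n)]
    weights = [1.0 / d for d in distances]
    d = rng.choices(distances, weights=weights, k=1)[0]
    return rng.choice(ring(u, d, n))


rng = random.Random(0)
n = 21
u = (10, 10)
v = sample_long_range_link(u, n, rng)
```

Sampling the distance first is what fixes $P(d) \propto d^{-1}$ regardless of how many nodes sit at each distance; the uniform choice within the ring then spreads that probability over the $c(u,v)$ candidates.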

For simplicity, we set $q = 1$. Let $S$ denote the set of all
nodes; then the probability that $u$ chooses $v$ as its long-range
connection in SIFN is given by Eq. (1),
where $c(u,v) = |\{x \mid d(u,x) = d(u,v),\ x \in S\}|$ and $d(u,v)$
denotes the lattice distance between nodes $u$ and $v$. Likewise,
the probabilities that $u$ chooses $v$ as its long-range connection in
KN and RFN are given by Eq. (2) and Eq. (3) respectively,
where $\mathrm{rank}_u(v) = |\{w \mid d(u,w) < d(u,v),\ w \in S\}|$ denotes
the number of nodes within distance $d(u,v)$ of node $u$ in
RFN [24]. Notice that the number of nodes at
distance $d(u,v)$ in a $k$-dimensional ($k > 1$) lattice is proportional
to $d(u,v)^{k-1}$. Thus, a node $u$ connecting to node $v$ with probability
proportional to $d(u,v)^{-\alpha}$ does not imply $P(d) \propto d^{-\alpha}$ but
$P(d) \propto d^{-\alpha+k-1}$ instead. Therefore, $\Pr_{KN}(u,v,k)$, $\Pr_{SIFN}(u,v)$
and $\Pr_{RFN}(u,v)$ are exactly the same for any $k$-dimensional lattice
based network when population density is uniform. However,
SIFN always satisfies the empirical result $P(d) \propto d^{-1}$
FIG. 1: Two strategies of sending a message in a 2-dimensional SIFN.
Strategy $\mathcal{A}$: send the message directly to target $t$ from the current
message holder using Kleinberg's greedy routing strategy. At each
step, the message is sent to the neighbor that is closest to
the target in the sense of lattice distance. Strategy $\mathcal{S}$: the message is
first sent to a given node $j$ using Kleinberg's greedy strategy and then
to the target node $t$ using the same strategy. Suppose we start from a
source node $s$; after one step, the message reaches nodes $A_1$ and $B_1$
with strategies $\mathcal{A}$ and $\mathcal{S}$ respectively. Taking $B_1$ as the new source
node, we get $A_2$ and $B_2$ with strategies $\mathcal{A}$
and $\mathcal{S}$ respectively in the next step.



in social networks compared with KN and RFN. Further,
$\Pr_{KN}(u,v,k)$, $\Pr_{SIFN}(u,v)$ and $\Pr_{RFN}(u,v)$ can be quite different
when the population density is non-uniform.
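
For reference, the three link probabilities can be written out from the constructions described in the text. The block below is a reconstruction under stated assumptions, not a verbatim copy of Eqs. (1)-(3): the SIFN normalization is taken to run over all admissible distances $d'$, the KN and RFN normalizations over all nodes $w \neq u$, and the KN exponent is taken to be the lattice dimension $k$.

```latex
\begin{align*}
\Pr\nolimits_{SIFN}(u,v) &= \frac{1}{c(u,v)} \cdot \frac{d(u,v)^{-1}}{\sum_{d'} d'^{-1}}, \\
\Pr\nolimits_{KN}(u,v,k) &= \frac{d(u,v)^{-k}}{\sum_{w \neq u} d(u,w)^{-k}}, \\
\Pr\nolimits_{RFN}(u,v)  &= \frac{\mathrm{rank}_u(v)^{-1}}{\sum_{w \neq u} \mathrm{rank}_u(w)^{-1}}.
\end{align*}
```

Under uniform density on a $k$-dimensional lattice, $c(u,v)$ grows like $d(u,v)^{k-1}$ and $\mathrm{rank}_u(v)$ like $d(u,v)^{k}$, so all three expressions decay as $d(u,v)^{-k}$; with non-uniform density only the SIFN form keeps $P(d) \propto d^{-1}$ by construction.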

Since our 2-dimensional SIFN captures the non-uniform
population density of real social networks, we
divide the navigation process into two stages for
simplicity: messages are first sent inside a small area and then
among the areas. To analyze the time complexity of navigation
in a 2-dimensional SIFN, we first compare the following
two searching strategies, as shown in Fig. 1. Strategy $\mathcal{A}$:
send the message directly to target $t$ from the current message
holder using Kleinberg's greedy routing strategy. At each
step, the message is sent to the neighbor that is
closest to the target in the sense of lattice distance. Strategy
$\mathcal{S}$: the message is first sent to a given node $j$ using Kleinberg's
greedy strategy and then to the target node $t$ using the
same strategy. It can be proved that strategy $\mathcal{A}$ performs better
than strategy $\mathcal{S}$ on average. Suppose we start sending the message
from the source node $s$, and the message reaches nodes $A_1$ and $B_1$
with strategies $\mathcal{A}$ and $\mathcal{S}$ respectively after one step. It always
holds that $d(A_1,t) \le d(B_1,t)$, because the
greedy routing strategy always chooses the neighbor closest to
target $t$. According to the results of [18, 21],
the longer the distance between a source and a given target,
the larger the expected number of steps. Thus we should have
$T(A_1 \to t) \le T(B_1 \to t)$, where $T(A_1 \to t)$ and $T(B_1 \to t)$
denote the expected delivery time to target $t$ from $A_1$ and $B_1$
respectively.

Let $T(s \to j \to t)$ denote the expected delivery time from
$s$ to $t$ via a transport node $j$; then we have
$T(s \to t) \le T(s \to B_1 \to t)$. Taking $B_1$ as a new source node,
the message will reach $A_2$ and $B_2$ with strategies $\mathcal{A}$ and $\mathcal{S}$
respectively in the next step. Following the same deduction, we have
$T(B_1 \to A_2 \to t) \le T(B_1 \to B_2 \to t)$. Repeating this process until
the message reaches the given node $j$ with strategy $\mathcal{S}$,
we obtain a monotonically increasing sequence of expected
delivery times $\{T(s \to B_1 \to t),\ T(s \to B_2 \to t),\ \ldots,\ T(s \to j \to t)\}$.
Therefore, we obtain $T(s \to t) \le T(s \to j \to t)$,
which means strategy $\mathcal{A}$ is better than strategy $\mathcal{S}$. This
analysis can be extended to any $k$-dimensional SIFN.
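
The comparison between direct greedy routing and routing via an intermediate node can also be explored numerically. The following is a small simulation sketch, assuming a finite $15 \times 15$ lattice with uniform density, one long-range link per node drawn with $P(d) \propto d^{-1}$, and randomly chosen source, intermediate and target nodes. All function names are hypothetical, and the averages it produces illustrate the argument rather than prove it.

```python
import random


def dist(a, b):
    """Lattice (Manhattan) distance between two nodes."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def ring(u, d, n):
    """Nodes of the n x n lattice at lattice distance exactly d from u."""
    nodes = []
    for dx in range(-d, d + 1):
        dy = d - abs(dx)
        for y in sorted({u[1] + dy, u[1] - dy}):
            x = u[0] + dx
            if 0 <= x < n and 0 <= y < n:
                nodes.append((x, y))
    return nodes


def build_links(n, rng):
    """One long-range link per node: distance d with P(d) ~ 1/d, then a uniform node at distance d."""
    links = {}
    for x in range(n):
        for y in range(n):
            u = (x, y)
            rings = {d: ring(u, d, n) for d in range(1, 2 * (n - 1) + 1)}
            ds = [d for d, r in rings.items() if r]
            d = rng.choices(ds, weights=[1.0 / k for k in ds], k=1)[0]
            links[u] = rng.choice(rings[d])
    return links


def greedy_steps(s, t, n, links):
    """Number of greedy hops from s to t: always forward to the contact closest to t."""
    cur, steps = s, 0
    while cur != t:
        x, y = cur
        contacts = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1), links[cur]]
        contacts = [c for c in contacts if 0 <= c[0] < n and 0 <= c[1] < n]
        # some lattice neighbor is always one step closer, so the distance strictly decreases
        cur = min(contacts, key=lambda c: dist(c, t))
        steps += 1
    return steps


rng = random.Random(1)
n = 15
links = build_links(n, rng)
nodes = [(x, y) for x in range(n) for y in range(n)]
trials = [rng.sample(nodes, 3) for _ in range(200)]  # (source, intermediate, target)

# Average delivery time of direct routing versus routing through the intermediate node j.
direct = sum(greedy_steps(s, t, n, links) for s, j, t in trials) / len(trials)
via = sum(greedy_steps(s, j, n, links) + greedy_steps(j, t, n, links)
          for s, j, t in trials) / len(trials)
```

Comparing `direct` with `via` over many trials gives an empirical counterpart to $T(s \to t) \le T(s \to j \to t)$ on this toy lattice.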

Based on the first assumption and the fact that SIFN is identical
to KN when population density is uniform, the expected number of
steps spent in each small area using Kleinberg's greedy algorithm
is at most $O(\log^2 n)$. Considering each small area as a node,
we get a new 2-dimensional weighted lattice. The weight
(population) of each node is between $m$ and $M$ by the
second assumption. Thus we have Eq. (4),
where $c$ is a constant and $\Pr_{SIFN}(u,v)$ represents the probability
that area $u$ is connected to area $v$ in the new weighted
lattice.

We say that the execution of the greedy algorithm is in phase
$j$ ($j \ge 0$) when the lattice distance from the current node to
target $t$ is greater than $2^j$ and at most $2^{j+1}$. Obviously, we
have

$$\sum_{d=1}^{n} d^{-1} \le 1 + \int_{1}^{n} x^{-1}\,dx = 1 + \log n < 2\log n. \qquad (5)$$
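
The chain of inequalities in Eq. (5) is easy to check numerically. The sketch below assumes that $\log$ denotes the natural logarithm, in which case the last inequality $1 + \log n < 2\log n$ requires $n > e$, that is $n \ge 3$ for integer $n$. The helper names are hypothetical.

```python
import math


def harmonic(n: int) -> float:
    """Partial harmonic sum: sum of 1/d for d = 1..n."""
    return sum(1.0 / d for d in range(1, n + 1))


def bound_holds(n: int) -> bool:
    """Check sum_{d=1}^{n} 1/d <= 1 + ln(n) < 2 ln(n)."""
    return harmonic(n) <= 1 + math.log(n) < 2 * math.log(n)


checks = {n: bound_holds(n) for n in (3, 10, 1000, 100000)}
```

For lattices of realistic size the bound is loose, which is harmless here because only the $O(\log n)$ growth of the normalization enters the later estimates.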



Further, we define $B_j$ as the node set whose elements are
within lattice distance $2^j + 2^{j+1} < 2^{j+2}$ of $u$. Let $|B_j|$ denote
the number of nodes in set $B_j$; we should have

$$|B_j| \ge 1 + \sum_{i=1}^{2^j} i > 2^{2j-1}. \qquad (6)$$



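Suppose that the message holder is currently in phase $j$. The
probability that it is connected by a long-range link
to a given node of $B_j$ is at least $(M m^{-1} \cdot 2\log n \cdot 4 \cdot 2^{2j+4})^{-1}$,
so by Eq. (6) the probability of reaching phase $j-1$ in one step is at least
$2^{2j-1}\,(M m^{-1} \cdot 2\log n \cdot 4 \cdot 2^{2j+4})^{-1} = m/(256 M \log n)$.
The probability $\Pr(x)$ of reaching the next phase $j-1$ in more than
$x$ steps is therefore bounded by

$$\Pr(x) \le \left(1 - \frac{m}{256 M \log n}\right)^{x} \qquad (7)$$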



and the average number of steps required to reach phase $j-1$
is

$$\langle x \rangle = \sum_{x=0}^{\infty} \Pr(x) \le \sum_{x=0}^{\infty} \left(1 - \frac{m}{256 M \log n}\right)^{x} = \frac{256 M \log n}{m}. \qquad (8)$$

Since the initial value of $j$ is at most $\log n$, the expected
total number of steps required to reach the target is at most
$O(\frac{M}{m}\log^2 n)$.

As a matter of fact, dividing the navigation process into the
above two stages means that we are using strategy $\mathcal{S}$
to send messages in the 2-dimensional SIFN. Thus, the time
complexity of navigation in SIFN with strategy $\mathcal{S}$ is at most
$O(\frac{M}{m}\log^4 n)$. However, the actual navigation process in the real world
is carried out regardless of the above two assumptions,
which indicates that individuals use strategy $\mathcal{A}$. Based on
the above analysis, strategy $\mathcal{A}$ performs better than strategy
$\mathcal{S}$ on average. Therefore, the time complexity of navigation
in a 2-dimensional SIFN is at most $O(\log^4 n)$ with non-uniform
population density.



III. RELATIONSHIP BETWEEN HIGH AND LOW
DIMENSIONAL SIFN

The empirical results show that individuals always resort to extra
information, such as profession and education information,
besides the target's geographic location when routing
messages [13]. The real navigation process in social networks
may therefore be modeled with a higher dimensional SIFN. In the
following, we discuss the relationship between high
and low dimensional SIFN and prove that a 2-dimensional
SIFN can be obtained from any $k$-dimensional SIFN ($k > 2$).
In particular, we provide the theoretical analysis for the case
$k = 3$. The analysis can be generalized to any $k$-dimensional
case.

We employ a random variable $D_3$ to denote the friendship
distance in a 3-dimensional SIFN. For simplicity, a continuous
expression is used. Since the long-range connections in a
3-dimensional SIFN satisfy the above empirical law, the PDF
of $D_3$ can be expressed by

$$P(D_3 = d) = \frac{1}{\ln d_M - \ln d_m} \cdot \frac{1}{d}, \quad d_m < d < d_M \qquad (9)$$

where $d_m$ and $d_M$ denote the minimum and maximum distance
respectively in the 3-dimensional SIFN.

We obtain a 2-dimensional network model if we project
a 3-dimensional SIFN onto a 2-dimensional world. Similarly, a
random variable $D_2$ is used to denote the friendship distance
in the new 2-dimensional network model. The condition for this
model to be a 2-dimensional SIFN is that
the PDF of $D_2$ satisfies $P(d) \propto d^{-1}$. Since $D_2$ is the projection
of $D_3$, $D_2$ can be seen as the product of $D_3$ and
$X$. Here the random variable $X$ is independent of $D_3$ and its PDF
is given by Eq. (10),

$$P(X = x) = \frac{1}{A}, \quad 0 < x < A \qquad (10)$$

where $0 < A < 1$. Finally, the PDF of $D_2$ can be written as

$$P(D_2 = d) = \begin{cases}
\dfrac{1}{A(\ln d_M - \ln d_m)}\left(\dfrac{1}{d_m} - \dfrac{1}{d_M}\right), & 0 < d < d_m A \\[2ex]
\dfrac{1}{\ln d_M - \ln d_m}\left(\dfrac{1}{d} - \dfrac{1}{A d_M}\right), & d_m A < d < d_M A \\[2ex]
0, & d > d_M A
\end{cases} \qquad (11)$$
When taking real social networks into account, $d_M$ is large
enough that the term $\frac{1}{A d_M}$ approaches its limit of 0. Meanwhile,
the term $d_m A$ can be neglected when compared with
$d_M A$ because $A < 1$ and $d_m$ is relatively small. Thus the PDF
of $D_2$ can be simplified to $P(d) \propto d^{-1}$, which is identical to
that of $D_3$ in a 3-dimensional SIFN.
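
The projection argument can be checked numerically. The sketch below assumes, as in the text, that $D_3$ has density proportional to $1/d$ on $[d_m, d_M]$ and that $X$ is uniform on $[0, A]$ and independent of $D_3$. The density of the product $D_2 = D_3 X$ is derived here with the standard formula for a product of independent variables, so treat it as a reconstruction. The parameter values and function names are illustrative assumptions.

```python
import math


def pdf_d2(d, d_m, d_M, A):
    """Density of D2 = D3 * X, with D3 ~ 1/d on [d_m, d_M] and X uniform on [0, A]."""
    L = math.log(d_M) - math.log(d_m)
    if d <= 0 or d > d_M * A:
        return 0.0
    if d < d_m * A:
        return (1.0 / d_m - 1.0 / d_M) / (A * L)   # flat part near the origin
    return (1.0 / d - 1.0 / (A * d_M)) / L          # approximately 1/d part


def total_mass(d_m, d_M, A, steps=20000):
    """Integrate pdf_d2 over (0, d_M * A]: exact flat part plus midpoint rule in log-space."""
    flat = pdf_d2(0.5 * d_m * A, d_m, d_M, A) * d_m * A
    lo, hi = math.log(d_m * A), math.log(d_M * A)
    h = (hi - lo) / steps
    tail = 0.0
    for i in range(steps):
        d = math.exp(lo + (i + 0.5) * h)
        tail += pdf_d2(d, d_m, d_M, A) * d * h      # dd = d du with u = ln d
    return flat + tail


d_m, d_M, A = 1.0, 1.0e6, 0.5
mass = total_mass(d_m, d_M, A)

# In the middle of the range, d * pdf(d) * (ln d_M - ln d_m) should be close to 1,
# which is the statement P(d) ~ 1/d.
d_mid = A * math.sqrt(d_m * d_M)
scaled = pdf_d2(d_mid, d_m, d_M, A) * d_mid * (math.log(d_M) - math.log(d_m))
```

With a large $d_M$ the correction $1/(A d_M)$ is negligible over almost the whole range, so the projected distance keeps the $d^{-1}$ law up to the small flat region below $d_m A$.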

Through theoretical analysis, we have proved that a 2-dimensional
SIFN can be seen as the projection of a 3-dimensional
SIFN. Likewise, we can get a 2-dimensional
SIFN from any $k$-dimensional ($k > 2$) SIFN. Notice that
individuals are always restricted to the 2-dimensional geographic
world even if they possess extra information from other
dimensions. Thus, the real-world searching process may be seen as
the projection of navigation in a high dimensional SIFN. Our
analysis indicates that the SIFN model may explain the navigability
of real social networks even when taking into account the fact that
individuals always resort to extra information in real searching
experiments.

IV. CONCLUSION 

Recent investigations suggest that the probability distribution
of having a friend at distance $d$ scales as $P(d) \propto d^{-1}$.
We propose an SIFN model based on this spatial property
to deal with the navigation problem with non-uniform population
density. It has been proved that the time complexity of
navigation in a 2-dimensional SIFN is at most $O(\log^4 n)$, which
corresponds to the upper bound of navigation in real social networks.
Given that individuals are always restricted to
the 2-dimensional geographic world even if they possess information
from higher dimensions, the actual searching process can
be seen as a projection of navigation in a higher $k$-dimensional
SIFN. Through theoretical analysis, we prove that the projection
of a higher $k$-dimensional SIFN results in a 2-dimensional
SIFN. Therefore, the SIFN model may explain the navigability of
real social networks even when taking into account information from
dimensions other than geography.